Skip to content

feat(telemetry): roll up http requests#946

Open
EhabY wants to merge 4 commits into
mainfrom
feat/issue-907-http-requests-rollup
Open

feat(telemetry): roll up http requests#946
EhabY wants to merge 4 commits into
mainfrom
feat/issue-907-http-requests-rollup

Conversation

@EhabY
Copy link
Copy Markdown
Collaborator

@EhabY EhabY commented May 8, 2026

Summary

  • Add http.requests telemetry rollups for active (method, route) buckets. Every 60 seconds the aggregator emits one event per active bucket with status counts, network errors, avg duration, and p95.
  • Normalize routes against an explicit rule table. Trailing literal segments pass through; unknown routes are emitted verbatim so high-cardinality endpoints get caught instead of silently masked.
  • Wire telemetry into the primary and remote-scoped CoderApi clients. Throwaway clients pass NOOP_TELEMETRY_REPORTER and never schedule a timer.

Closes #907.

Validation

  • pnpm test:extension ./test/unit/settings/telemetry.test.ts ./test/unit/logging/httpRequestsTelemetry.test.ts ./test/unit/api/coderApi.test.ts
  • pnpm test:extension
  • pnpm format:check
  • pnpm lint
  • pnpm typecheck
Implementation
  • HttpRequestsTelemetry aggregates per (method, route) and emits the rollup every 60s. Tracks the actual window start so a partial flush on dispose() reports a usable elapsed value. Skips the timer entirely when constructed with the NOOP reporter so throwaway clients don't leak it.
  • Route normalization is a flat table of explicit rules (api/v2/workspaces/{id}, api/v2/users/{name}/keys/{id}, etc.). No UUID or numeric parsing: if a new endpoint needs masking, add a rule.
  • TelemetryReporter interface and NOOP_TELEMETRY_REPORTER decouple CoderApi from the concrete TelemetryService.
  • Tests cover same-bucket rollup, status buckets, empty windows, p95, route normalization (including fallback and undefined / "" inputs), dispose-flush, the NOOP timer guard, and the axios interceptor wiring.

Generated by Coder Agents.

@EhabY EhabY self-assigned this May 8, 2026
@EhabY
Copy link
Copy Markdown
Collaborator Author

EhabY commented May 11, 2026

/coder-agents-review

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good addition. The tumbling-window aggregation design is proportional to the problem, the TelemetryReporter interface extraction keeps coupling narrow, and the NOOP sentinel pattern for throwaway clients is clean. The test harness is well-structured with fake timers, meaningful assertions, and non-vacuous coverage.

Severity summary: 1 P2, 13 P3, 1 P4, 5 Nit, 2 Note.

The P2 is the UUID regex version nibble constraint ([1-5]), which rejects UUIDv6/v7/v8 in the fallback normalizer. Seven reviewers flagged it independently. The fix is one character.

Of the P3s, the most impactful cluster: dispose() drops accumulated data without flushing, the timer callback has no try/catch (a throw permanently kills the chain), and average()/percentile95() produce NaN/undefined on empty input. All are straightforward defensive fixes.

On the design side: the route normalization fallback passes name-bearing tail segments (like usernames after /members/) through unchanged, which partially contradicts the stated goal of stripping user/workspace names. And the interceptor registration order means telemetry records phantom errors for requests that are transparently retried by the cert-refresh or auth interceptors. Neither is urgent, but both are worth acknowledging.

One reviewer questioned whether the windowSeconds config knob earns its ~115 lines of schema, reader, watcher, update, and tests given nobody will change the default. Worth a human decision.

"A public surface with zero in-tree callers is a foot-gun, not dead code." (Kite, on the UUID fallback path)

🤖 This review was automatically generated with Coder Agents.

Comment thread src/logging/httpRequestsTelemetry.ts Outdated
Comment thread src/logging/httpRequestsTelemetry.ts
Comment thread src/logging/httpRequestsTelemetry.ts
Comment thread src/logging/httpRequestsTelemetry.ts
Comment thread src/logging/httpRequestsTelemetry.ts Outdated
Comment thread src/api/coderApi.ts
Comment thread src/api/coderApi.ts Outdated
Comment thread src/logging/httpRequestsTelemetry.ts
Comment thread CHANGELOG.md Outdated
Comment thread test/unit/logging/httpRequestsTelemetry.test.ts
EhabY and others added 3 commits May 11, 2026 15:53
Drop the recorder interface, noop sentinels, and creation helper —
construct HttpRequestsTelemetry unconditionally and let the noop
reporter swallow .log() calls. Flatten route normalization rules,
switch buckets to a nested map (no string key encoding), and stop
pushing zero-duration samples when metadata is absent so p95 isn't
skewed. Drop the LOCAL_SINK_SETTING backwards-compat alias.
@EhabY EhabY force-pushed the feat/issue-907-http-requests-rollup branch from cd6bd55 to 64f09d1 Compare May 11, 2026 12:53
EhabY added a commit that referenced this pull request May 11, 2026
- UUID regex matches any version (v6/v7/v8 now normalize). Then dropped
  UUID/numeric heuristics entirely; route normalization uses explicit
  rules only. Added api/v2/workspaces/{id}/builds/{id} for the main
  composite case. Trailing literal segments still pass through.
- Flush the pending bucket on dispose so the last partial window is
  not lost.
- Drop the coder.telemetry.local.httpRequests.windowSeconds knob and
  hardcode 60s, removing the schema, reader, watcher, update method,
  and related tests.
- Rename addLoggingInterceptors -> addRequestInterceptors. Document
  that cert-refresh retries re-enter the chain so each attempt is
  recorded in telemetry.
- Add explicit rule for /api/v2/organizations/{id}/members/{name}.
- Use toSorted, drop the unused export, comment normalizeByRule.
- Drop "aggregated" from the CHANGELOG entry.
- Tests: dispose-flush, no-rule fallback, undefined / "" inputs.
@EhabY
Copy link
Copy Markdown
Collaborator Author

EhabY commented May 11, 2026

/coder-agents-review

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strong response to R1. All 19 addressed findings verified: UUID regex eliminated entirely (class-of-bug fix), config surface removed, dispose flushes, try/finally protects the timer chain, empty-array guards return 0, naming and docs cleaned up. The normalization redesign (position-based rules, no UUID regex) is cleaner than what it replaced.

DEREM-20 (contested, interceptor ordering): panel closed 3/3 accept. The author's defense holds: each interceptor-level attempt is a real network event, cert-refresh frequency is bounded, and the added comment at coderApi.ts:520-522 documents the counting semantics.

Severity summary for new findings: 1 P2, 4 P3.

The P2 is a regression from the R1 refactor: removing the NOOP sentinel check means every CoderApi.create() now constructs a real HttpRequestsTelemetry with a running 60-second timer, including 5-6 throwaway call sites that never dispose. The fix is straightforward: lazy-start the timer on first record() instead of in the constructor.

"What happens when the extension runs for a week in a remote window?" (Hisoka, on the timer leak)

🤖 This review was automatically generated with Coder Agents.

Comment thread src/logging/httpRequestsTelemetry.ts Outdated
Comment thread src/logging/httpRequestsTelemetry.ts
Comment thread src/logging/httpRequestsTelemetry.ts Outdated
Comment thread src/telemetry/reporter.ts
Comment thread src/api/coderApi.ts
- Drop UUID and numeric heuristics from route normalization. Use
  explicit rules only; add api/v2/workspaces/{id}/builds/{id} for the
  main composite path. Trailing literal segments still pass through.
- Add a rule for api/v2/organizations/{id}/members/{name}.
- Flush the pending bucket on dispose so the last partial window is
  not lost. Wrap the flush in try/finally so a throw still clears
  state.
- Track the start of each window and emit the actual elapsed seconds
  in the http.requests event so consumers can compute a real rate.
- Skip the timer when constructed with NOOP_TELEMETRY_REPORTER so
  throwaway CoderApi clients don't leak it.
- Drop the coder.telemetry.local.httpRequests.windowSeconds knob and
  hardcode 60s, removing the schema, reader, watcher, update method,
  and related tests.
- Rename addLoggingInterceptors to addRequestInterceptors. Note that
  cert-refresh retries re-enter the chain, so each attempt is counted.
- toSorted, drop the unused export, document the public surface.
- Drop "aggregated" from the CHANGELOG entry.
- Tests: dispose-flush, NOOP timer-leak guard, no-rule fallback,
  undefined / "" inputs.
@EhabY EhabY force-pushed the feat/issue-907-http-requests-rollup branch from 4cc8b77 to 22fa9eb Compare May 11, 2026 15:10
@EhabY EhabY requested a review from mtojek May 11, 2026 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Telemetry: http.requests rollup

1 participant